Integrated Survey Data

Overview and conditions of access

2025-10-01

Plan of the presentation

  1. Most common surveys with integrated data
  2. Typical data integrated with surveys
  3. Accessing secure integrated datasets

Introduction

Integrated data: when non-survey data are added to survey data

  • Whether part of the original data collection or not

  • Whether primary or secondary

  • Whether same unit of analysis or not

  • Whether for validation or enhancement (Benzeval et al. 2020)

  • Typically administrative records, measured data, social media data

  • Examples include accelerometer data, genetic data, individual NHS data, social security data, etc.

  • This talk mostly deals with integrated data available from the UK Data Service
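
As a rough illustration of what integration means in practice, the sketch below joins survey responses to administrative records through a shared pseudonymised identifier and flags which respondents could be linked. Every identifier, field name and value is made up for the example; real linked datasets have their own keys, consent rules and access conditions.

```python
# Hypothetical survey responses, keyed by a pseudonymised person identifier.
survey = [
    {"pid": "A01", "age": 34, "self_rated_health": "good"},
    {"pid": "A02", "age": 51, "self_rated_health": "fair"},
    {"pid": "A03", "age": 27, "self_rated_health": "good"},
]

# Hypothetical administrative records (here, a count of hospital admissions),
# assumed to exist only for respondents who consented to linkage.
admin = {
    "A01": {"admissions": 0},
    "A03": {"admissions": 2},
}


def link(survey_rows, admin_by_pid):
    """Left-join admin fields onto survey rows and flag unlinked respondents."""
    linked = []
    for row in survey_rows:
        extra = admin_by_pid.get(row["pid"])
        merged = dict(row)  # copy, so the original survey rows are untouched
        merged["linked"] = extra is not None
        if extra is not None:
            merged.update(extra)
        linked.append(merged)
    return linked


integrated = link(survey, admin)
```

Keeping unlinked respondents in the result, with an explicit flag, makes it possible to check afterwards whether the linked subsample differs from the full survey sample.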

1. Which surveys include integrated data?

Overview

  • Partly depends on the kind of data linked to surveys

  • … and on the scientific teams that performed the linkage, i.e. research-based vs

  • Major longitudinal studies

    • Birth Cohort studies
    • Next Steps and ELSA
    • Understanding Society
  • A few large-scale cross-sectional government surveys

    • ASHE (Annual Survey of Hours and Earnings)
    • Health Survey for England
    • Family Resources Survey

Birth cohort studies

  • Follow a sample of individuals* over their whole life
  • Born in a specific week of 1958 (NCDS), 1970 (BCS), 1989-90 (Next Steps), 2000 (MCS), 2026 (?)
  • MCS
    • ~ 19,000 children originally (between June 2001 and Jan 2003)
    • 7 ‘sweeps’: at 9 months, then at 3, 5, 7, 11, 14 and 17 years old
    • parent and child interviews
    • Focuses on education, skills, health, truancy, cognitive ability and biological measurements, in addition to traditional socio-economic data
    • Has been widely linked; allows a range of analyses of factors vs outcomes

Understanding Society

  • The largest UK longitudinal study

  • Initial sample size: 40K households, 100K individuals

  • 14 yearly waves so far: 2009-2023; includes BHPS data 1991-2002

  • Ethnic minority boost samples; Innovation Panel

  • Very wide range of topics covered:

    • Employment, income, benefits, savings, debt, and assets
    • Health, well-being, and health behaviours
    • Housing, housing costs, and dwelling characteristics
    • Family, partnerships, caring responsibilities
    • Education, training
    • Expenditure, consumption, deprivation
    • Social attitudes, values, political opinions
    • Transport, mobility, and commuting patterns
    • Environmental behaviours and related attitudes

2. What kind of data is integrated with UKDS surveys?

Overview

  • Administrative records

  • i.e. data collected by a public (state-controlled) authority: government departments, the NHS

  • Health: NHS SHS: medical records, i.e. in/outpatient attendance, hospital episodes, maternity

  • Education: Department for Education (DfE), National Pupil Database, school attendance; school profile/teacher survey; distance to grammar school; student loan data; OFSTED data

  • Pollution; green space deciles; PAYE data

  • Social media/Digital trace

What is on offer: examples

1. Genetic risk data

  • Polygenic scores, also called polygenic indexes (PGI), for health and social outcomes

  • Combinations of genetic variants associated with the probability of certain outcomes

    • 45 traits, e.g. health outcomes and behaviours; mental health and personality traits; social outcomes

    • Available on the Birth Cohorts and Next Steps datasets

    • Subsamples limited to ‘Europeans’ from a genetic perspective
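
For readers unfamiliar with polygenic scores, the sketch below shows the general idea: a weighted sum of a person's allele counts, with weights taken from an association study. It is a minimal illustration only and not the method used to build the scores deposited with the cohort data; the variant names, weights and the choice to treat missing genotypes as zero are all assumptions made for the example.

```python
def polygenic_score(allele_counts, weights):
    """Weighted sum of allele counts (0, 1 or 2 copies per variant).

    Variants missing from allele_counts are treated as 0 copies,
    which is a simplifying assumption for this sketch.
    """
    return sum(w * allele_counts.get(variant, 0) for variant, w in weights.items())


# Hypothetical per-variant effect sizes (not real estimates).
weights = {"variant_a": 0.5, "variant_b": -0.25, "variant_c": 1.0}

# Hypothetical genotype for one respondent: copies of the effect allele.
respondent = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_score(respondent, weights)
```

In practice scores of this kind are usually standardised within the sample before analysis, so that a value is read relative to other respondents rather than on an absolute scale.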

2. Hospital episodes data

  • NHS data about all hospital admissions in England
  • Four datasets:
    • Episodes of care in: Accident and Emergency; Admitted Patient Care; Adult Critical Care; Outpatients
    • Mostly available for 2007/9-2023
  • Data on diagnosis, maternity, mortality, mental health, length of treatment, deprivation, etc.
  • Available for the NCDS Birth Cohort

3. School inspection data

  • OFSTED ‘State of the nation’: anonymised data on the latest inspection outcomes of 22,000 open schools

  • Linked with the MCS; currently covers years 2005 to 2019

  • Data on a wide range of topics, such as:

    • Quality of teaching, learning and assessment
    • Effectiveness of leadership and management
    • Pupils’ achievement (aggregated) (2005-2015)
    • Behaviour and safety of pupils (2005-2015)

4. NEST pension data

  • Main employer pension scheme for UK employees

  • Covers 1 million employers and 11 million employees

  • Linked to consenting Understanding Society Wave 11 respondents (about 12,000)

  • Data about:

    • Employer and employee characteristics
    • Current pension status
    • Characteristics of pension contributions

5. Is that everything?

  • List of additional admin data available (e.g. DVLA/Understanding Society)
  • Highlights from ReShare

3. How to access integrated data

Accessing secure data

  • A couple of slides on this (I will contact Essex as suggested)

Additional resources